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Abstract 



The theory of multidimensional persistence captures the topology of a multifiltration - a multiparameter family of 
increasing spaces. Multifiltrations arise naturally in the topological analysis of scientific data. In this paper, we give 
a polynomial time algorithm for computing multidimensional persistence. We recast this computation as a problem 
within computational commutative algebra and utilize algorithms from this area to solve it. While the resulting 
problem is ExPSPACE-complete and the standard algorithms take doubly-exponential time, we exploit the structure 
inherent withing multifiltrations to yield practical algorithms. We implement all algorithms in the paper and provide 
statistical experiments to demonstrate their feasibility. 

1 Introduction 

In this paper, we give a polynomial time algorithm for computing the persistent homology of a multifiltration. The 
computed solution is compact and complete, but not an invariant. Theoretically, this is the best one may hope for 
because complete compact invariants do not exist for multidimensional persistence. We also discuss computing an 
incomplete invariant, the rank invariant, giving an algorithm for reading off this invariant from our complete solution. 
We implement all algorithms in the paper and provide statistical experiments to demonstrate their feasibility. 

1.1 Motivation 

Intuitively, a multifiltration models a growing space that is parameterized along multiple dimensions. For example, 
the complex with coordinate (3, 2) in FigureQ]is filtered along the horizontal and vertical dimensions, giving rise to 
a bifiltration. Multifiltrations arise naturally in topological analysis of scientific data. Often, scientific data is in the 
form of a finite set of noisy samples from some underlying topological space. Our goal is to robustly recover the 
lost connectivity of the underlying space. If the sampling is dense enough, we approximate the space as a union of 
balls by placing e-balls around each point. As we increase e, we obtain a growing family of spaces, a one-parameter 
multifiltration also called a filtration. This approximation is the central idea behind many methods for computing the 
topology of a point set, such as Cech, Rips-Vietoris lfT2l . or witness Q complexes. Often, however, the input point set 
is filtered via multiple functions. For example, in analyzing the structure of natural images, we filter the data according 
to density [1|. We now have multiple dimensions along which our space is filtered; that is, we have a multifiltration. 

1.2 Prior Work 

For one-dimensional filtrations, the theory of persistent homology provides a complete invariant called a barcode, 
a multiset of intervals J22). Each interval in the barcode corresponds to the lifetime of a single topological feature 
within the filtration. Since intrinsic features have long lives, while noise is short-lived, a quick examination of the 
intervals gives a robust estimation of the topology. The existence of a complete compact invariant, as well as efficient 
algorithms and fast implementations have led to successful application of persistence to a variety of disciplines, such 

*The authors were partially supported by the following grants: G. C. by NSF DMS-0354543; A. Z. by DARPA HR 0011-06-1-0038, ONR 
N 00014-08-1-0908, and NSF CCF-0845716; all by DARPA HR 001 1-05-1-0007. 
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Figure 1: A bifiltration. The complex with labeled vertices is at coordinate (3, 2). Simplices are highlighted and named at 
the critical coordinates that they appear. 



as shape description (5), denoising volumetric density data |fT3l , detecting holes in sensor networks [8], analyzing 
neural activity in the visual cortex [18], and analyzing the structure of natural images [ 1 1, to name a few. 

For multifiltrations of dimension higher than one, the situation is much more complicated. The theory of multidi- 
mensional persistence shows that no complete discrete invariant exists, where discrete means that the structure of the 
target for the invariant does not depend on the structure of the underlying field of coefficients [2]. Instead, the authors 
propose an incomplete invariant, the rank invariant, which captures important persistent information. Unfortunately, 
this invariant is not compact, requiring large storage, so its direct computation using the one-dimensional algorithm is 
not feasible. A variant of the problem of multidimensional persistence appears in computer vision [ 10 1. There is also 
a partial solution called vineyards [4|. A full solution, however, has not been attempted by any prior work. 

1.3 Contributions 

In this paper, we provide a complete solution to the problem of computing multidimensional persistence. 

• We recast persistence as a problem within computational commutative algebra, allowing us to utilize powerful 
algorithms from this area. 

• We exploit the structure provided by a multifiltration to greatly simplify the algorithms. 

• We show that the resulting algorithms are polynomial time, unlike their original counterparts, which are Ex- 
PSPACE-complete, requiring exponential space and time. 

• We implement all algorithms and show that the multidimensional setting requires different data structures than 
the one-dimensional case for efficient computation. In particular, the change in approach allows for paralleliza- 
tion. 

• We provide a suite of statistical tests with random multifiltrations and analyze the running time in practice. 

As such, our work provides a full integrated solution, from a new theoretical understanding in computational commu- 
tative algebra, to the refinement of algorithms, full implementation, and experimentation. 

We begin with background on multidimensional persistence in Section [2] In Section [3] we formalize the input 
to our algorithms and justify it. In Section |4j we reinterpret the problem of computing multidimensional persistence 
within computational commutative algebra. Having recast the problem, we use algorithms from this area to solve 
the problem. This gives us a computationally intractable solution. In Section [5] we simplify the algorithms by using 
the structure within multifiltrations. This simplification allows us to derive a polynomial time algorithm from the 
original doubly-exponential algorithms. In Section|7] we describe our implementations of these algorithms and provide 
experiments to validate our work in practice. 
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2 Background 



In this section, we provide the background on multidimensional persistence. We begin by formalizing multifiltrations. 
We then describe simplicial homology, persistent homology, and multidimensional persistence, as well as the definition 
of the rank invariant. 

2.1 Multifiltrations 

Let N C Z be the set of non-negative integers. For vectors in N n or M. n , we say u < v if «, < Vi for 1 < i < n. A 
topological space X is multifiltered if we are given a family of subspaces {X u } u , where u £ N n and X u C X such 
that for u,wi,W2,v £ W\ the diagrams 

X u y X Wl 

(1) 

"V" 4^ 

commute whenever u < wt, W2 < v. We call the family of subspaces {X u } u a multifiltration. A one-dimensional 
multifiltration is called a. filtration. We show an example of a bifiltration of a simplicial complex in Figure Q] In a 
multifiltration, any path with monotonically increasing coordinates is a filtration, such as any row or column in the 
figure. Multifiltrations constitute the input to our algorithms. We further motivate their use as a model for scientific 
data in the next section. 



2.2 Homology 

Given a topological space, homology is a topological invariant that is often used in practice as it is easily computable. 
Here, we describe simplicial homology briefly, referring the reader to Hatcher fl4l as a resource. We assume our input 
is a simplicial complex K, such as the complexes in FigureQ] We note, however, that our results carry over to arbitrary 
cell complexes, so we use cell complexes when appropriate in our presentation. The ith chain group Ci (K) of K is the 
free Abelian group on K's set of oriented i-simplices. An element c <E Ci(K) is an i-chain, c = . rij[aj}, Oj 6 K 
with coefficients rij £ Z. Given such a chain c, the boundary operator di : Gi(K) — > d-i(K) is a homomorphism 
defined linearly by its action on any simplex a = [vo, vi, . . . , vi\ £ c, 

did = ^(-l) J [u ,...,«i,...,«i], (2) 

3 

where Vj indicates that Vj is deleted from the vertex sequence. The boundary operator connects the chain groups into 
a chain complex C» : 

■ •• -> C i+1 (K) ^ d(K) ^ Ct-iiK) - • • • . (3) 

Using the boundary operator, we may define subgroups of Cf. the cycle group kerdi and the the boundary group 
imdi+i. Since didi+i = 0, then im9j+i C kerc^ C d(K). The if/i homology group is 

ff i ( J fC)=ker^/im9i + i (4) 

and the if/i Beff/ number is (3i(K) = rank Hi(K), Over field coefficients fc, is a fc-vector space of dimension Pi. 
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2.3 Persistent Homology 

Given a multifiltration {!„}„, the homology of each subspace X u over a field k is a vector space. For instance, the 
bifiltered complex in FigureQ]has zeroth homology vector spaces isomorphic to the commutative diagram 

k 2 — ► k 2 — > k > k 

T T T T 

k 2 — > k 4 — > k 3 — > k 3 

T T T T 

k* —> t 3 —> t 2 —> k 2 



where the dimension of the vector space counts the number of components of the complex, and the maps between 
the homology vector spaces are induced by the inclusion maps relating the subspaces. Persistent homology captures 
information contained in the induced maps. There are two equivalent definitions that we use in this paper. The first 
definition was originally for filtrations only [9], but was later extended to multifiltrations [2]. The key idea is to relate 
homology between a pair of complexes. For each pair u, v G N n with u < v, X u C X v by definition, so X u <^-> X v . 
This inclusion, in turn, induces a linear map Li(u,v) at the ith homology level Hi(X u ) — > Hi(X v ) that maps a 
homology class within X u to the one that contains it within X v . The ith persistent homology is im tj, the image of 
for all pairs u < v. This definition also enables the definition of an incomplete invariant. The ith rank invariant is 

Pi{u,v) = rank Oi(u,v), (5) 

for all pairs u < v G N n , where tj is defined above 1121 . 

While the above definition provides intuition, it is inexpedient for theoretical development. For most of our paper, 
we use a second definition of persistence that is grounded in algebraic topology, allowing us to utilize tools from 
commutative algebra for computation Il22l l2l. 



2.4 Multidimensional Persistence 

The key insight for the second definition is that the persistent homology of a multifiltration is, in fact, the homology 
of a single algebraic entity. This object encodes all the complexes via polynomials that track cells through the multi- 
filtration. To define our algebraic structure, we need to first review graded modules over polynomials. A monomial in 
x\ , . . . , x n is a product of the form 

(6) 

with Vi G N. We denote it x v , where v = (vi, . . . , v n ) G N™. A polynomial f in x\, . . . , x n and coefficients in field 
k is a finite linear combination of monomials, / = c v xl '> w i tn °v £ k. The set of all such polynomials is denoted 
k[x%, . . . , x n ]. For instance, hx\x\ — lx\ G k\x\, X2, Xs\. An n-graded ring is a ring R equipped with a decomposition 
of Abelian groups R = ® V R V , v G N n so that multiplication has the property R u ■ R v C R u + V . Elements in a single 
group Ru are called homogeneous. The set of polynomials form the n-graded polynomial ring, denoted A n . This ring 
is graded by A™ = kx v , v G N™. An n-graded module over an n-graded ring R is an Abelian group M equipped with 
a decomposition M = (B v M v , v G N™ together with a i?-module structure so that R u ■ M v C M u+V . An n-graded 
module is finitely generated if it admits a finite generating set. Also, recall the notion of a free module on an n-graded 
set and a basis for such a module 1 2 1 . 

Given a multifiltration {X u } u , the ith dimensional homology is the following n-graded module over A n 

@Hi(X u ), (7) 

u 

where the fc-module structure is the direct sum structure and x v ~ u : Hi(X u ) — > Hi(X v ) is the induced homomorphism 
Li (u,v) we described in the previous section. This view of homology yields two important results. In one dimension, 
the persistent homology of a filtration is easily classified and parameterized by the barcode, and there is an efficient 
algorithm for its computation |22|. In higher dimensions, no similar classification exists J2). Instead, we may utilize 
an incomplete invariant. For instance, the rank invariant defined above is provably equivalent to the barcode in one 
dimension, allowing us to compute incomplete information in higher dimensions. 
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3 One-Critical Multifiltrations 



Our interest in persistent homology stems from the need for analyzing scientific data. In this section, we begin by 
formalizing such data. We then show that topological analysis of scientific data naturally generates multifiltrations. In 
particular, it generates multifiltrations with the following property. 

Definition 1 (one-critical) A multifiltered complex K where each cell a has a unique critical coordinate u a is one- 
critical. 

In the rest of this paper, we assume that our input multifiltrations are one-critical. General multifiltrations, however, 
may not have this property. Therefore, we end this section by describing a classic construction that eliminates multiple 
critical coordinates in such input. 

3.1 Model for Scientific Data 

We are often given scientific data in the form of a set of samples, possibly noisy, from some underlying geometric 
space. At each sample point, we may also have measurements from the ambient space. For example, a fundamental 
goal in graphics is to render objects under different lighting from different camera positions. One approach is to 
construct a digitized model using data from a range scanner, which employs multiple cameras to sense 3D positions 
on an object's surface, as well as estimated normals and texture information |[T9l . An alternate approach samples 
the four-dimensional light field of a surface directly and interpolates to render the object without explicit surface 
reconstruction [ 15 1. Either approach gives us a set of noisy samples with measurements. Similarly, a node in a wireless 
sensor network has sensors on board that measure physical attributes of the local environment, such as pressure and 
temperature Ell . The GPS coordinates of the nodes give us a set of samples at which several functions are sampled. 

Formally, we assume we have a manifold X with d — 1 Morse functions defined on it |fl6l . In practice, X is often 
embedded within a high-dimensional Euclidean space R d , although this is not required. As such, we model the data 
using the following definition. 

Definition 2 (multifiltered data) We assume our input is multifiltered data, a finite set S of d-dimensional points 
along with n — 1 real-valued functions fj : S — > R, 1 < j < n — 1, defined at the samples, for n > 1. 

3.2 Construction 

We begin by approximating the underlying space X with a combinatorial representation, a complex. There are a 
variety of techniques for building such complexes, all of which have a scale parameter e, such as the Cech and Rips- 
Vietoris |[T2l complex which employ a global parameter, or the witness complex [7 1 which simulates a local parameter. 
As we increase e, a complex grows larger, and fixing a maximum scale e max gives us a filtered complex K with n 
cells. Each cell a G K enters K at scale e(cr). We formalize this type of complex next. 

Definition 3 (scale-filtered complex) A scale-filtered complex is the tuple (K, a), where K is a finite complex and 
e : K — > R, such that for any a G K, a G K u only for u > e(er). 

We now assume we have a scale-filtered complex K, e defined on our multifiltered point set. To incorporate the 
measurements at the points into our analysis, we first extend the sampled functions fj to the cells in the complex. For 
a G K and fj, let fj(cr) be the maximum value fj takes on cr's vertices; that is, fj(cr) = max„ £ff fj(v), where v is a 
vertex. This extension defines n — 1 functions on the complex, fj\K—> R. We now combine all filtration functions 
into a single multivariate function F : K — > R™, where 

F(a) = (fi(a)J 2 (a), f n -i(a), e(a)) . (8) 

We may now filter K via the excursion sets {K u } u of F: 

K u = {a G K | F(o) < u G R"}. (9) 

Note that simplex a enters K u at u — F{a) and will remain in the complex for all u > F(a). Equivalently, F(a) 
is the unique minimal critical coordinate at which a enters the filtered complex. That is, these multifiltrations are 
one-critical. 
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Example 1 (bifiltration criticals) The bifiltration in Figure[T]is one-critical, as each simplex enters at a single critical 
coordinate. For instance, F(a) = (1, 1), F(cde) = (3, 1), and F(af) = (1, 2). 

Since K is finite, we have a finite set of critical coordinates that we may project on each dimension j, getting a 
set finite set of critical values Cj. We now restrict ourselves to the Cartesian product C\ X . . . X C n of the critical 
values, parameterizing the resulting discrete grid using N in each dimension. It is clear that the required diagrams (Q]) 
commute, giving us a rf-dimensional multifiltration {K v } v with v G N n . 

We end by noting that one-critical multifiltrations may be represented compactly by the set of tuples 

{{a,F(a))\aeK}. (10) 

This representation is the main input to our algorithms in Section |4~3| 

3.3 Mapping Telescope 

In general, multifiltrations are not one-critical since a cell may have enter at multiple incomparable critical coordi- 
nates, viewing < as a partial order on N". For example, in Figure[U the vertex that enters at (1, 0) may also enter 
at (0, 1) as the two coordinates are incomparable. For such multifiltrations, we may utilize the mapping telescope, 
a standard algebraic construction to ensure that each cell has a unique critical coordinate fl4l . Intuitively, this con- 
struction introduces additional shadow cells into the multifiltration without changing its topology. We will not detail 
this construction here as none of the multifiltrations we encounter in practice require the conversion. We should note, 
however, that the mapping telescope increases the size of the multifiltration, depending on the number of cells with 
multiple critical points. In the worst case, the increase is exponential, but this requires strange constructions that are 
not topologically interesting and do not arise in practice. 

4 Using Computational Commutative Algebra 

Having described our input in the last section, in this section we recast the problem of computing multidimensional 
persistence as a problem within computational commutative algebra. We then describe standard algorithms from this 
area that solve our problem. While this gives us a solution, it comes at a computational cost, as these algorithms are 
intractable. In the next section, we refine them to derive polynomial-time algorithms. 

4.1 Multigraded Homology 

We begin by extending simplicial homology to a multifiltered simplicial complex. We then convert the computation 
of the latter to standard questions in computational commutative algebra. 

Definition 4 (chain module) Given a multifiltered simplicial complex {K u } u , the ith chain module is the n-graded 
module over A n 

C i = @C i (K u ), (11) 

u 

where the k-module structure is the direct sum structure and x v ~ u : Ci{K u ) — > Ci{K v ) is the inclusion K u K v . 

Note that we overload notation to reduce complexity by having Cj = Ci({K u }u) when the multifiltration is clear 
from context. The module C,; is n-graded as for any u 6 N n , (Ci) u = Ci(K u ). That is, the chain complex in grade u 
of the module is the chain complex of K u , the simplicial complex with coordinate u. 

Example 2 (bifiltration module) Consider the vertex d in the bifiltered complex in FigureQ] This vertex has critical 
coordinate (1, 0), so copies of this vertex exist in 9 complexes K u for u > (1, 0). The inclusion maps relate these 
copies within the complexes. In turn, polynomials relate the chain groups in the different grades of the module. Let 
d be the copy of the vertex in the critical coordinate (1, 0). Then, within Cj, we have d in grade (1, 0), x\d in grade 
(2, 0), x^d in grade (1, 1), x\x\d in grade (3, 2) and so on, as required by the definition of an n-graded module. In 
other words, a simplex has different names in different grades. 

The graded chain modules Cj are finitely generated, so we may choose bases for them. 



Preliminary Manuscript - 10/2/09 — Page 6 



Definition 5 (standard basis) The standard basis/or the i chain module Ci is the set ofi-simplices in critical grades. 



Example 3 (bifiltration bases) For our bifiltration in Figure Q] the standard basis corresponds to the highlighted and 
named simplices. For example, the standard basis for Co is 



grade 


(0,0) 


(1,0) (1,1) 


simplices 


b, c,e,f 


d a 



Note that in doing so, we have made a choice of basis. Unlike for chain groups, this choice has an important conse- 
quence: Our resulting calculations will not be invariant but depend on the initial basis. 

Recall that our multifiltrations are one-critical. The graded chain groups of one-critical multifiltrations are free: 
Since each cell enters only once, the resulting chain groups do not require any relations. Since our graded chain groups 
are free, the boundary operator is simply a homomorphism between free graded modules. Given standard bases, we 
may write the boundary operator di : Ci — » Cj_i explicitly as a matrix with polynomial entries. 

Example 4 (boundary matrix) For the bifiltration in Figure Q] d\ has the matrix 





ab 


be 


cd 


de 


e/ 


af 


bf 


ce 


a 
















X2 








b 


X\X% 


x\x\ 














xl 





c 





2 2 
X X X 2 


Xl 














X2 


d 








1 


1 














e 











Xl 


r 2 
x 1 








X2 


. / 














r 2 


x\x\ 


4 






where we assume we are computing over Z2. 

As in standard homology, the boundary operator connects the graded chain modules into a chain complex C* (Equa- 
tion[3]l and the ith homology module is defined exactly as before (Equation!!]): 

H, = kcr d t J im d i+1 (13) 

4.2 Recasting the Problem 

Our goal is to compute the homology modules. Following the definition, we have three tasks: 

1 . Compute the boundary module im <9 i+1 . 

2. Compute the cycle module ker di. 

3. Compute the quotient Hi . 

We next translate these three tasks into problems in computational commutative algebra. Both the boundary and cycle 
modules turn out to be submodules of free and finitely generated modules that consist of vectors of polynomials. For 
the rest of this paper, we assume that we are computing homology over the field k. Recall from Section l2~4l that our 
module is defined over the n-graded polynomial ring A n = k[x\, . . . , x n ] with standard grading A™ = kx v , v € N n . 
For notational simplicity, we will use B = A n to denote this ring for the remainder of this section. 

Let R m be the Cartesian product of m copies of R. In other words, R rn consists of all column m-vectors of 
polynomials: 

R m = {[h,---J m ] T \h^RA<i<m}. (14) 

To distinguish elements of R m from polynomials, we adopt the standard practice of placing them in bold format, so 
that f G R m is a vector of polynomials, but / S R is a polynomial. We use this practice exclusively for elements of 
R m and not for other vectors, such as elements of N™. We now recast the three problems above into the language of 
computational commutative algebra. 
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1. The boundary module is a submodule of the polynomial module. The matrix for di+i has to$ rows and 7Bj+i 
columns, where rrij denotes the number of j-simplices in the complex. Let F = (fi, . . . , f mi+ i), fi £ R mi , 
where fi is the zth column in Mj+i. This tuple of polynomial vectors generate a submodule of R mi 

{rrn+i ) 
J2 «&fj I Qi G R | . (15) 

The Submodule Membership Problem asks whether a polynomial vector f is in a submodule M; That is, whether 
we may write f in terms of some basis F as above. A solution to this problem would complete our first task. 

2. The cycle submodule is also a submodule of the polynomial module. The matrix for 9, has m,_i rows and m. L 
columns. Let F — (fj, . . . , f mi ), fi G i? mi_1 , where fi is the ith column in the matrix. Given F, the set of all 

[qi, q mi ] T , q% e -R such that 

i=l 

is a i?-submodule of i£ m< called the (first) syzygy module of (fi, . . . , f mi ), denoted Syz(fi, . . . , f mi )- A set of 
generators for this submodule would complete our second task. 

3. Our final task is simple, once we have completed the first two tasks. All we need to do is test whether the 
generators of the syzygy submodule, our cycles, are in the boundary submodule. As we shall see, the tools 
which allow us to complete the first two tasks also resolve this question. 



4.3 Algorithms 

In this section, we begin by reviewing concepts from commutative algebra that involve the polynomial module R m 
We then look at algorithms for solving the submodule membership problem and computing generators for the syzygy 
submodule. In our treatment, we follow Chapter 5 of Cox, Little, and O'Shea f6l . 

The standard basis for R m is {ei, . . . , e m }, where ei is the standard basis vector with constant polynomials 
in all positions except 1 in position i. We use the "top down" order on the standard basis vectors, so that ei > ej 
whenever i < j. A monomial m in R m is an element of the form x u ei for some i and we say m contains e\. 

For algorithms, we need to order monomials in both R and R m . For u, v E N n , we say u >\ ex v if the vector 
difference u — v 6 Z n , the leftmost nonzero entry is positive. The lexicographic order >/„ is a total order on W 1 . 
For example, (1, 3, 0) >i ex (1, 2, 1) since (1, 3, 0) — (1, 2, 1) = (0, 1, —1) and the leftmost nonzero entry is 1. Now, 
suppose x u and x v are monomials in R. We say x u >i ex x v if u >i ex v. This gives us a monomial order on R. We 
next extend >i ex to a monomial order on R m using the "position-over-term" (POT) rule: x u ei > x v e^ if i < j, or if 
i = j and x u >i ex x v . Now, every element f e R m may be written, in a unique way, as a fc-linear combination of 
monomials mi, 

m 

f = 5^c i mi ) (17) 

i=i 

where Cj € fc, Cj ^ and the monomials rtii are ordered according to the monomial order. We now define: 

• Each Cjirii is a term of f . 

• The leading coefficient of f is LC(f) = C\ £ k. 

• The leading monomial of f is LM(f ) = mi. 

• The leading term of f is LT(f ) = cimi. 

Example 5 Let f = [5a;ia;|, 2xi — 7xg] T S R 2 . Then, we may write f in terms of the standard basis (Equation[T7l): 

f = 5[ Xl xl 0] T - 7[0, x 3 3 ] T + 2[0, Xl ] T (18) 
= dxixlex - 7xle 2 + 2xie 2 . (19) 

From the second line, the monomials corresponding to sum ( [T7| > are mi = xia^ei, m2 = x\e 2 , and ni3 = xie 2 . 
The second term of f is 7[0, Xgjand we have Lc(f ) = 5, LM(f ) = x\x\, and LT(f ) = hx\x\. 
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Finally, we extend division and least common multiple to monomials in R and R m . Given monomials x u , x v G R, 
if x u >ie X x v , then x° divides x u with quotient x u /x v — x u ~ v . Moreover, we define w G N n by W{ = max(ui,Vi) 
and define the monomial x w to be the least common multiple of x u and x v , denoted LCM(x u , x v ) = x w . Next, given 
monomials m = x u e\ and n = x v ej in R m , we say n divides m iff i = j and x v divides x u , and define the quotient 
to be m/n = x u /x v = a;" - ". In addition, we define 

LCMfa^ei^ej) = ( LCM^ar")^, i = j, 
x ■" [0, otherwise. 

Clearly, the LCM of two monomials is a monomial in R and R m , respectively. 

Example 6 Let f = [x±, xiX2] T and g = [x2, 0] T be elements of R 2 . Then, the LCM of their leading monomials is: 

LCM(LM(f),LM(g)) = LCM(xiei, X 2 ei) (21) 

= x^ex- (22) 

We now turn our attention to the submodule membership problem. Formally, given a polynomial vector f and a 
set of t polynomial vectors F, we would like to know whether / G (F). We may divide f by F using the division 
algorithm DIVIDE in Figure|2] After division, we have f = (Yli=i Q&) + r ' so ^ tne remainder r = 0, then f G (F). 
This condition, however, is not necessary for modules over multivariate polynomials (n > 1) as we may get a non-zero 
remainder even when / G (F). 

Let M be an submodule and LT(M) be the set of leading terms of elements of M. A Grobner basis is a basis 
G C M such that (lt(G)} = (lt(M)}. For Grobner bases, whenever / G (F), we always get r = after division of 
f by (F), so we have solved the membership problem. The BUCHBERGER algorithm in Figure[3]computes a Grobner 
basis G starting from any basis F. The syzygy polynomial vector or S -polynomial S(f, g) G R m of f and g is 

* (f ' g) = ^ f -^g) g < ^ (23) 

h = LCM(LM(f ), LM(g)). (24) 

The BUCHBERGER algorithm utilizes 5-polynomials to eliminate the leading terms of polynomial vectors and com- 
plete the given basis to a Grobner basis. 

A Grobner basis generated by the algorithm is neither minimal nor unique. A reduced Grobner basis is a Grobner 
basis G such that for all g G G, Lc(g) = 1 and no monomial of g lies in (lt(G — {g}). A reduced Grobner 
basis is both minimal and unique. We may compute a reduced Grobner basis by reducing each polynomial in G in 
turn, replacing g G G with the remainder of Divide(i?, G — {<?}). Since the algorithm is rather simple, we do not 
present pseudo-code for it. The DIVIDE, BUCHBERGER, and the reduction algorithms together solve the submodule 
membership problem and, in turn, our first task of computing imdj+i. 



DlVlDE(f,(f 1) ...,f t )) 

1 p <- f , r <- 

2 for i <— 1 to t 

3 do q t <— 

4 while p^O 

5 do if LT(fi) divides LT(p) for some i 

6 then q t «- q t + lt (p) / lt (fi ) 

7 p«-p-(LT(p)/LT(ft))f, 

8 else r <— r + LT(p) 

9 p «- p LT(p) 

10 return ((q 1} ... ,q t ),r) 



Figure 2: The DIVIDE algorithm divides f G R m by an t-tuple (fi , . . . , ft), fi G R m to get f = (E™ i 9* f i) + r - where 
qi € R and r G R m . 
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Buchberger(F) 

1 G <- F 

2 repeat G <- G 

3 foreach pair f ^geG 

4 (( g i,...,g t ),r)^DlVlDE(S(f,g),G) 

5 if r ^ 

6 then G^GUjr} 

7 until G = G' 

8 return G 



Figure 3: The algorithm BUCHBERGER completes a given basis F to a Grobner basis G by incrementally adding the remain- 
ders of S-polynomials, divided by the current basis. 



We next compute generators for the syzygy submodule to complete our second task. We begin by computing a 
Grobner basis G — {gi, . . . , g s } for (F), where the vectors are ordered with some monomial order. We then compute 
DlVlDE(S'(gi, gj), G) for each pair of Grobner basis elements. Since G is a Grobner basis, the remainder of this 
division is 0, giving us 

s 

S(Si,Sj) =^2lijkSk- (25) 

fc=i 

Let ei , . . . , e s be the standard basis vectors in R s and let 

hy = LCM(LT(gi,gj)), (26) 

<uj = J2 q » k€k eRS - (27) 

k=l 

For pairs such that hy ^ 0, we define Sy £ R s by 

sy = -^et - -^T-,e> - qy 6 R s , (28) 

LT(gi) LT(gj) 

with sy = 0, otherwise. Schreyer's Theorem states that the set {sy}y form a Grobner basis for Syz(gi, . . . , g s ) [6 
Chapter 5, Theorem 3.3]. Clearly, we may compute this basis using DIVIDE. We use this basis to find generators for 
Syz(fi, . . . ,f t ). 

Let M f and Mq be the mxt and mxs matrices in which the fj 's and gi's are columns, respectively. As both bases 
generate the same module, there is a t x s matrix A and an s x t matrix B such that Mq — M fA and M f = MqB. 
To compute A, we simply initialize A to be the identity matrix and add a column to A for each division on line 4 
of BuCHBERGERthat records the pair involved in the 5-polynomial. The matrix B may be computed by using the 
division algorithm. To see how, notice that each column of Mp is divisible by Mq since Mq is a Grobner Basis for 
Mp\ now there is a column in B for each column fi £ Mp, which is obtained by division of fi by Mq. Let Si, . . . , St 
be the columns of the t x t matrix I t — AB. Then, 

Syz(f 1 ,...,f t ) = (Asy, Sl ,...,s t ), (29) 

giving us the syzygy generators [6 , Chapter 5, Proposition 3.8]. We refer to the algorithm sketched above as Schreyer's 
algorithm. This algorithm completes our second task. The third task was to compute the quotient Hi given im di+\ = 
G and kcrdi = Syz(fi, . . . ,ft). We simply need to find whether the columns of kerdi can be represented as a 
combination of the columns of the im di+i . Now, Hi may be computed using the division algorithm. We divide every 
column in the ker<9; by the matrix imdi+i using the DIVIDE algorithm. If the remainder is non-zero, we add the 
remainder both to im<9i + i and Hi(so as to count only unique cycles). 

While the above algorithms solve the membership problem, they have not been used in practice due to their com- 
plexity. The submodule membership problem is a generalization of the Polynomial Ideal Membership Problem (PIMP) 
which is ExPSPACE-complete, requiring exponential space and time IT71I20I . Indeed, the Buchberger algorithm, in its 
original form, is doubly-exponential and is therefore not practical. 



Preliminary Manuscript - 10/2/09 — Page 10 



5 Multigraded Algorithms 



In this section, we show that multifiltrations provide additional structure that may be exploited to simplify the al- 
gorithms from the previous section. These simplifications convert these intractable algorithms into polynomial time 
algorithms. 



5.1 Exploiting Homogeneity 

The key property that we exploit for simplification is homogeneity. 

Definition 6 (homogeneous) Let M be anm x n matrix. The matrix M is homogeneous iff 

1. every column ( row) f of M is associated with a coordinate Uf and corresponding monomial x Ui , 

2. every non-zero element Mjk may be expressed as the quotient of the monomials associated with column k and 
row j, respectively. 

Any vector f endowed with a coordinate Uf that may be written as above is homogeneous, e.g. the columns of M. 

We will show that all boundary matrices di may be written as homogeneous matrices initially, and the algorithms 
for computing persistence only produce homogeneous matrices and vectors. That is, we maintain homogeneity as an 
invariant throughout the computation. We begin with our first task. 



Lemma 1 For a one-critical multifdt ration, the matrix of di : Ci 
homogeneous. 



Cj_i written in terms of the standard bases is 



Proof: Recall that we may write the boundary operator di : Ci — ► explicitly as a rrii-i x m., matrix M in 

terms of the standard bases for Ci and Cj_i, as shown in matrix dT2b for d\. From Definition [5] the standard basis 
for Ci is the set of i-simplices in critical grades. In a one-critical multifiltration, each simplex a has a unique critical 
coordinate u a (Definition (Q3). In turn, we may represent this coordinate by the monomial x Ua . For instance, simplex 
a in Figure[T]has critical grade (1,1) and monomial a^ 1 ' 1 ) = X1X2. We order these monomials using >i ex and use this 
ordering to rewrite the matrix for di. The matrix entry Mjk relates <jfe, the fcth basis element for Ci to frj, the jth basis 
element for Ci-%. If Oj is not a face of 07., then Mjk = 0. Otherwise, frj is a face of a^. Since a face must precede a 
co-face in a multifiltration, 



X U °« >lex X Ua > 

Mj k = X u "h/x Us i =x U ^~ Us i. 

That is, the matrix is homogeneous. 

For example, u\ = a is a face of u\ = ab, so M\\ = x\x\jx\X2 = X2 in matrix (l33l >. 



(30) 
(31) 
(32) 
□ 



Coroliary 1 For a one-critical multifiltration, the boundary matrix di in terms of the standard bases has monomial 
entries. 

Proof: The result is immediate from the proof of the previous lemma. The matrix entry is either 0, a monomial, or 

x u ^ CTk ' u ^ii, a monomial. □ 

Example 7 We show the homogeneous matrix for d\ below, where we augment the matrix with the associated mono- 
mials. Again, we assume we are computing over Z2. 







ab 


be 


cd 


de 


ef 




bf 


ce 








x\x\ 


Xl 


Xl 


xl 


xix'i 


xl 


X2 


a 


X1X2 


X2 














Xl 








d 


Xl 








1 


1 














b 


1 


X\X% 


2 2 
x 1 x 2 














xl 





c 


1 





2 2 
X X X 2 


Xl 














XI 


e 


1 











Xl 


xl 








x 2 


. / 


1 














xl 


XlX% 


xl 






(33) 
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We next focus on our second task, showing that given a homogeneous matrix as input, the algorithms produce 
homogeneous vectors and matrices. Let F be an m x n homogeneous matrix. Let {ei, . . . , e m } and {ei, . . . , e n } be 
the standard bases for R m and R n , respectively. A homogeneous matrix associates a coordinate and monomial to the 
row and column basis elements. For example, since X\ is the monomial for row 2 of matrix d33l l, we have u e2 = (1,0) 
and x Ue 2 = x\. Each column f in F is homogeneous and may be written in terms of rows: 

i—l 

where c; G k and we allow Cj = when a row is not used. For instance, column g representing the edge ab in the 
bifiltration shown in Figure[T]may be written as: 

g = x 2 e 1 + x 2 x\e 3 (35) 

_ X^l^ + X2XI B3 (36) 



XiX 2 1 



X s X s x X 

X e i X e 3 — ' X 



e{i,3} 



Consider the BUCHBERGER algorithm in Figure [3] The algorithm repeatedly computes S'-polynomials of homo- 
geneous vectors on line 4. 

Lemma 2 The S-polynomial S(f, g) of homogeneous vectors f and g is homogeneous. 

Proof: A zero 5-polynomial is trivially homogeneous. A non-zero ^-polynomial S(f , g) implies that h in Equa- 
tion d24i i is non-zero. By the definition of LCM (Equation l20li. this implies that the leading monomials of f and g 
contain the same basis element ej. We have: 

x Uf 

LM(f) = — ej (38) 

LM(g) - (39) 

x i 

h = LCM(LM(f ), LM(g)) (40) 

= lcm — ,— ej (41) 

V x j x J / 



LCM (x Uf ,X Us ) 



~Letx e — LCM(x Uf ,x u s) = X LCM( - Uf ' u s) ? giving us h = -^j-ej. We now have 



(42) 



X J J 



LT(f) Cf ^ ej 



CfX" f 

where Cf 7^ is the field constant in the leading term of f . Similarly, we get 

h xr 



(43) 
(44) 



, Cg^O (45) 

lt( 5 ) c g x u z s 



Preliminary Manuscript - 10/2/09 — Page 12 



Putting it together, we have 



Vc— e —X^ c >— e - (47) 

1 i=l 8 i=l 

m £ 



where di — Ci/cf — c^/c g . Comparing with Equation d34l ), we see that 5(f , g) is homogeneous with «s(f,g) = i- D 

Having computed the ^-polynomial, BUCHBERGER next divides it by the current homogeneous basis G on line 4 
using a call to the DIVIDE algorithm in Figure|2] 

Lemma 3 DlVlDE(f , (fi , . . . , ft)) returns a homogeneous remainder vector r for homogeneous vectors f , fj £ R m . 

Proof: On line 1, r and p are initialized to be and f, respectively, and are both trivially homogeneous. We will 
show that each iteration of the while loop starting on line 4 maintains the homogeneity of these two vectors. On line 
5, since both fi and p are homogeneous, we have 



p = H^5^r e j- (50) 



3=1 

Since LT(fi) divides LT(p), the terms must share basis element and we have 



LT(fj) = c lfc -^e k (51) 

X e k 

LT(p) = d k ^-e k (52) 

X k 

LT(p)/LT(fO = ^~, (53) 

Cik % 1 



where x u *> >\ ex x Uf i so that the division makes sense. On line 7, p is assigned to 



p - (LT(p)/ LT(fi))fi = ^ dj -^e } - \-- • — J ^ ^ej (54) 



3=1 x 7 3=1 



^ V C ik J X J 



3=1 

a; 



Ex p 
J IE J 

3=1 

where g^- = dj — dk ■ Cij jc^ and d' k = 0, so the subtraction eliminates the kth term. The final sum means that p is now 
a new homogeneous polynomial with the same coordinate u p as before. Similarly, LT(p) is added to r on line 8 and 
subtracted from p on line 9, and neither action changes the homogeneity of either vector. Both remain homogeneous 
with coordinate u p . □ 

The lemmas combine to give us the desired result. 
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Theorem 1 (homogeneous Grobner) The BUCHBERGER algorithm computes a homogeneous Grobner basis for a 
homogeneous matrix. 



Proof: Initially, the algorithm sets G to be the set of columns of the input matrix F, so the vectors in G are homo- 
geneous by Lemma Q] On line 4, it computes the S'-polynomial of homogeneous vectors f,g £ G. By Lemma [2] 
the S'-polynomial is homogeneous. It then divides the S-polynomial by G. Since the input is homogeneous, DIVIDE 
produces a homogeneous remainder r by Lemma [3] Since only homogeneous vectors are added to G on line 6, G 
remains homogeneous. We may extend this result easily to the reduced Grobner basis. □ 

Using similar arguments, we may show the following result, whose proof we omit here. 

Theorem 2 (homogenous syzygy) For a homogeneous matrix, all matrices encountered in the computation of the 
syzygy module are homogeneous. 

5.2 Data Structures and Optimizations 

We have shown that the structure inherent in a multifiltration allows us to compute using homogeneous vectors and 
matrices whose entries are monomials only. We next explore the consequences of this restriction on both the data 
structures and complexity of the algorithms. 

By Definition ©, an m x n homogeneous matrix naturally associates monomials to the standard bases for R m 
and R n . Moreover, every non-zero entry of the matrix is a quotient of these monomials as the matrix is homogeneous. 
Therefore, we do not need to store the matrix entries, but simply the field elements of the matrix along with the 
monomials for the bases. We may modify two standard data structures to represent the matrix. 

• linked list: Each column stores its monomial as well as a linked-list of its non-zero entries in sorted order. The 
non-zero entries are represented by the row index and the field element. The matrix is simply a list of these 
columns in sorted order. Figure [4] displays matrix d33l l in this data structure. 

• matrix: Each column stores its monomial as well as the column of field coefficients. If we are computing over 
a finite field, we may pack bits for space efficiency. 

The linked-list representation is appropriate for sparse matrices as it is space-efficient at the price of linear access 
time. This is essentially the representation used for computing in the one-dimensional setting ll22l . In contrast, the 
matrix representation is appropriate for dense matrices as it provides constant access time at the cost of storing all zero 
entries. The multidimensional setting provides us with denser matrices, as we shall see, so the matrix representation 
becomes a viable structure. 

In addition, the matrix representation is optimally suited to computing over the field Z2, the field often commonly 
employed in topological data analysis. The matrix entries each take one bit and the column entries may be packed 
into machine words. Moreover, the only operation required by the algorithms is symmetric difference which may be 
implemented as a binary XOR operation provided by the chip. This approach gives us bit-level parallelism for free: 
On a 64-bit machine, we perform symmetric difference 64 times faster than on the list. The combination of these 
techniques allow the matrix structure to perform better than the linked-list representation in practice. 

We may also exploit homogeneity to speed up the computation of new vectors and their insertion into the basis. 
We demonstrate this briefly using the BUCHBERGER algorithm. We order the columns of input matrix G using the 
POT rule for vectors as introduced in Section|4] Suppose we have f,g e G with f > g. If S(f , g) 7^ 0, LT(f) and 




Figure 4: The linked list representation of the boundary matrix di of Equation[l2] for the bifiltration shown in Figure[TJ 
in column sorted order. Note that the columns in Equation [T2] are not ordered while they are sorted correctly here. 
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LT(g) contain the same basis, which the S'-polynomial eliminates. So, we have S(f, g) < g < f. This implies that 
when dividing S(f , g) by the vectors in G, we need only consider vectors that are smaller than g. Since the vectors 
are in sorted order, we consider each in turn until we can no longer divide. By the POT rule, we may now insert the 
new remainder column here into the basis G. This gives us a constant time insertion operation for maintaining the 
ordering, as well as faster computation of the Grobner basis. 

5.3 Complexity 

In this section, we give simple polynomial bounds on our multigraded algorithms. These bounds imply that we may 
compute multidimensional persistence in polynomial time. 

Lemma 4 Let F be an m x n homogeneous matrix of monomials. The Grobner basis G contains 0(n 2 m) vectors in 
the worst case. We may compute G using BUCHBERGER in 0(n m 3 ) worst-case time. 

Proof: In the worst case, F contains nm unique monomials. Each column f G F may have any of the nm monomials 
as its monomial when included in the Grobner basis G. Therefore, the total number of columns in G is 0(n 2 m). In 
computing the Grobner Basis, we compare all columns pairwise, so the total number of comparisons is 0(n 4 m 2 ). 
Dividing the 5-polynomial takes 0(m) time. Therefore, the worst-case running time is 0(n 4 m 3 ). □ 

In practice, the number of unique monomials in the matrix is lower than the worst case. In computing persistence, for 
example, we may control the number of unique monomials by ignoring close pairs of gradings. The following lemma 
bounds the basis size and running time in this case. 

Lemma 5 Let F be an m x n homogeneous matrix with h of unique monomials. The Grobner basis G contains 
0(hn) vectors and may be computed in time 0(n 3 h 2 ). 

The proof is identical to the previous lemma. 

Lemma 6 Let F be an m x n homogeneous matrix of monomials and G be the Grobner Basis of F. The syzygy 
module S for G may be computed using Schreyer's algorithm in 0{n m 2 ) worst-case time. 

Proof: In computing the syzygy Module, we compare all columns of G pairwise, so the total number of comparisons 
is 0(n 4 m 2 ). Dividing the 5-polynomial takes 0(m) time. Therefore, the worst-case running time is 0(n 4 m 3 ). □ 

Theorem 3 Multidimensional persistence may be computed in polynomial time. 

Proof: Multidimensional persistence is represented by the Grobner Bases and the syzygy moduli of all the homoge- 
neous boundary matrices di for a given multifiltration. In the previous lemmas, we have shown that both the Grobner 
Basis and the syzygy module can be computed in polynomial time. Therefore, one can compute multidimensional 
persistence in polynomial time. □ 

In other words, our optimizations in this section turn the exponential-algorithms from the last section into polynomial- 
time algorithms. 

6 Computing the Rank Invariant 

Having described our algorithms, in this section we discuss the computation of the rank invariant. Recall that our 
solution is complete, but not an invariant. In contrast, the rank invariant (Equation|5]l is incomplete, but is an invariant 
and may be used, for instance, as a descriptor in order to compare and match multifiltrations. We begin with a naive 
algorithm that computes the invariant for each pair independently. We then discuss alternate approaches using posets 
and vineyards. Direct computation, however, suffers from the storage requirement, which is intractable. Therefore, 
we end this section by giving an algorithm for reading off the rank invariant from the solution computed using our 
multigraded algorithms. 



Preliminary Manuscript - 10/2/09 — Page 15 



6.1 Direct Computation 



We assume we are given a rt-dimensional multifiltration of a cell complex K with m cells. Recall the rank invariant, 
Equation (0, from Section [2] We observe that any pair u < v € N™ defines a one-dimensional filtration: We let t be 
a new parameter, map u to t = 0, v to t = 1, and obtain a two-level filtration. We then use the persistence algorithm 
to obtain barcodes ll22l . The invariant u) may be read off from the 0i -barcode: It is the number of intervals that 
contain both and 1. The persistence algorithm is 9(m 3 ) in the worst-case, so we have a cubic time algorithm for 
computing the rank invariant for a single pair of coordinates. 

To fully compute the rank invariant, we need to consider all distinct pairs of complexes in a multifiltration. It 
may seem, at first, that we need to only consider critical coordinates, such as (1, 1) and (2, 0) in the bifiltration in 
Figure Q] However, note that the complex at coordinate (2, 1) is also distinct even though no simplex is introduced 
at that coordinate. Inspired by this example, we may devise the following worst-case construction: We place m/n 
cells on each of the n axis to generate (m/n) n = 8(m") distinct complexes. Simple calculation shows that there are 
0(m 2 ™) comparable coordinates with distinct complexes. For each pair, we may compute the rank invariant using our 
method above for a total of 0(m 2n+3 ) running time. To store the rank invariant, we also require 8(rn 2 ™) space. 

6.2 Alternate Approaches 

Our naive algorithm above computes the invariant for each pair of coordinates independently. In practice, we may read 
off multiple ranks from the same barcode for faster calculation. Any monotonically increasing path from the origin to 
the coordinate of the full complex is a one-dimensional filtration, such as the following path in Figure Q] 

(0,0)->(l,l)->(2,2)-(3,2) (57) 

Having computed persistence, we may read off the ranks for all six comparable pairs within this path. We may 
formalize this approach using language from the theory of partially ordered sets. The path described above is a 
maximal chain in the multifiltration poset: a maximal set of pairwise comparable complexes. We require a set of 
maximal chains such that each pair of comparable elements (here, complexes) are in at least one chain. Each maximal 
chain requires a single one-dimensional persistence computation. We now require an algorithm that computes the 
smallest set of such chains. 

Another approach is to use Vineyards as introduced in [4|. The basic idea is the following: given two functions, 
compute a homotopy parameterized by A; for each A compute the persistence barcodes and finally study the stack of 
these barcodes. A drawback of this approach is that a homotopy of functions might suffer from range compression. 
Consider a topological feature which exists for a long range in both the filtrations, but it is possible that the range for 
which the feature exists in the homotopy is rather small. Thus studying the vineyard, one would see this feature for a 
very short while, even though the feature existed for a long range in both the filtrations. 

6.3 Multigraded Approach 

Regardless of the approach, the direct computation is hampered by the exponential storage requirement, motivating 
our work in computing the rank invariant indirectly. Therefore, we first compute multidimensional persistence using 
our multigraded algorithms in Section [5] We then simply read off the rank invariant using the Rank algorithm, as 
shown in Figure [5] We describe the algorithm in the proof of the following theorem. 

Theorem 4 Rank(Z, B, u, v) computes the rank invariant pi(u, v), if Z is the syzygies of di and B is the Grobner 
basis for 

Proof: The algorithm uses two simple helper procedures. The procedure PROMOTE takes a matrix M and coordinate 
u as input. It then finds the columns f G M whose associated coordinate Uf precedes u, and promotes them to 
coordinate u by a simple shift. The procedure QUOTIENT finds the quotient of the input matrices by division: If the 
remainder r is non-zero, it adds r to the quotient Q, also adding it to B so it only find unique cycles. Now assume 
the input are as in the statement of the theorem. By the definition of the rank invariant, we need to count homology 
cycles that exist at u and persist until v. The Rank algorithm implements this. We compute homology H u at u on the 
first three lines. On line|4] we promote these cycles to coordinate v. We then quotient with boundaries B v at v to find 
homology cycles H uv that exist at u and persist until v. The size of this set is the rank invariant by definition. □ 
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Rank(Z, B,u,v) 

1 Z u <— Promote (Z, u) 

2 B u <— Promoters, u) 

3 H u ^ Quotient^, B u ) 

4 Z uv <— Promote (H u ,v) 

5 B v «— Promote(£?,i;) 

6 <- QUOTIENT(Z nl , , 

7 return|i/ u „| 

Quotient^, B) 

1 Q^0 

2 foreach f e Z 

3 do ((gi,...,g t ),r) ^DiviDE(f,S) 

4 if r ^ 

5 then Q^QU{r} 

6 B^BU {r} 

7 return Q 

Promote(M, m) 

1 return {— f | f G AI,Uf < u\ 



Figure 5: The algorithm RANK computes the rank invariant Pi(u, v) if Z is the set of syzygies of di and B is the Grobner 
basis for 9j+i. The procedure QUOTIENT finds the quotient of Zby B using the DIVIDE algorithm. The procedure PROMOTE 
promotes cycles that exist before time u to that time. 



In this section, we describe our implementation as well as initial quantitative experiments that show the performance 
of our algorithms in practice. We end with a last look at our example bifiltration in Figure [T] computing its rank 
invariant using our multigraded algorithms. 

7.1 Implementation 

We initially used software packages CbCbA[3| and Macaulay |TT|, which contain standard implementations of the 
algorithms. These packages were immensely helpful during our software development as they allowed for quick 
and convenient testing of the basic algorithms. In practice, there are two problems in using these packages for large 
datasets. First, these packages are slow since they are general and not customized for homogeneous matrices. Second, 
these packages produce verbose output that must be parsed for further computation. 

Our experience led us to implement our algorithms for computation over Z2, optimizing the code for this field. 
Our implementation is in Java and and was tested under Mac OS X 10.5.6 running on a 2.3Ghz Intel Quad-Core Xeon 
MacPro computer with 6GB RAM. 



We generate n x n, random, bifiltered, homogeneous matrices, to simulate the boundary matrix dk-i of a random 
bifiltered complex with n simplices in dimensions k — 1 and k — 2. We use the following procedure: 

1. Randomly generate n monomials {mi, . . . ,m n } corresponding to the monomials associated with the basis 
elements of the rows. 

2. For each column f generate k integers indexing the non-zero rows. 




7 Experiments 



7.2 Data 
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3. Set the column monomial to be LCM(mj), where {rrij}j are the monomials of rows with non-zero 

Each column in this matrix has k non-zero elements and is homogeneous by construction. We also generate random 
matrices but limit the number of unique monomials in the matrix to be 0(h 2 ) for different values of h. The basic 
idea behind these tests is that the range of the filtrations in a simplicial complex can typically be divided into smaller 
discrete intervals. For generation, we replace the first step of the procedure above with the following two steps: 

0. Randomly generate h unique monomials {h, . . . , 1^}. 

1. Generate n monomials {mi, . . . , m n } corresponding to the monomials associated with the basis elements of 
the rows such that ro» € {Zi, . . . , h.}- 

After executing step 2 and 3 above, our resulting matrix has homogeneous columns with k non-zero elements and at 
most h 2 unique monomials. 



7.3 Size & Timings 

According to Lemma|4] the number of columns in the Grobner basis for a random matrix may grow 0(n 3 ) (we have 
n = m here.) Figure [6(a)] shows that the growth of the Grobner basis is less in practice (about linear for k = 2 and 
quadratic for k = 4) and increases as the matrix becomes denser. Similarly, the theoretical running time for this matrix 
is 0(n 7 ). Figure [6(b)| demonstrates that the actual running time matches this bound quite well (about 0(n 6 ) in these 
tests.) The matrix method, however, is considerably more efficient, about 20 times faster for our largest test here. 
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Figure 6: Random n x n matrices with k non-zero entries in each column, (a) The number of columns \G\ in Grobner basis 
G (b) Running time in seconds for computing multidimensional persistence using list (1) or matrix (m) data structures. 

We next limit the number of unique monomials in the input matrices. Figures [7] and [8] give the size and running 
time for matrices with at most h 2 monomials for h = 20 and h = 100, respectively. We see that the growth of the basis 
is about linear for different values of k and h, and the running time matches the theoretical 0(n?) bound in Lemma|5] 
quite well. 



7.4 Rank Invariant 

We end this paper by revisiting our motivating bifiltration from Figure [TJ and computing its multidimensional persis- 
tence and rank invariants using our algorithms. Using the natural ordering on the simplices, one can write the boundary 
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number of columns \G\ in Grobner basis G. (b) Running time in seconds for computing multidimensional persistence using 
list (1) or matrix (m) data structures. 



matrices as: 
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The Grobner Basis {G\) and the set of syzygies (Zi) for d\ are: 
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Note that each row in the syzygy matrix corresponds to an edge in the appropriate order. Finally, the Grobner Basis 
for is G2 = 82 (since the only possible ^-polynomial is identically 0). 
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Figure 8: Random n x n matrices with k non-zero entries in each column and a total of h 2 monomials for h — 100. (a) The 
number of columns \G\ in Grobner basis G. (b) Running time in seconds for computing multidimensional persistence using 
list (1) or matrix (m) data structures. 



Using Gi, Z\ and G2, one can read off the rank invariants for various u and v using the Rank algorithm in 
Section 1631 A few interesting rank invariants for this example are: 
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8 Conclusion 

In this paper, we give a full treatment to the computation of multidimensional persistence, from theory to imple- 
mentation and experiments. We develop polynomial time algorithms by recasting the problem into computational 
commutative algebra. Although the recast problem is ExPSPACE-complete, we exploit the multigraded setting to 
develop practical algorithms. The Grobner bases we construct allow us to reconstruct the entire multidimensional 
persistence vector space, providing us a convenient way to compute the rank invariant. We implement all algorithms 
in the paper and show that the calculations are feasible due to the sparsity of the boundary matrices. 

For additional speedup, we plan to parallelize the computation by batching and threading the XOR operations. 
We also plan to apply our algorithms toward studying scientific data. For instance, for zero-dimensional homology, 
multidimensional persistence corresponds to clustering multiparameterized data, This gives us a fresh perspective, as 
well as a new arsenal of computational tools, to attack an old and significant problem in data analysis. 
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